Book/Dissertation / PhD Thesis FZJ-2018-02625

Management of Electrophysiological Data & Metadata - Making complex experiments accessible to yourself and others



2018
Forschungszentrum Jülich GmbH Zentralbibliothek, Verlag Jülich
ISBN: 978-3-95806-311-2

Jülich : Forschungszentrum Jülich GmbH Zentralbibliothek, Verlag, Schriften des Forschungszentrums Jülich. Reihe Schlüsseltechnologien / Key Technologies 167, 182 pp. = RWTH Aachen, Diss., 2017

Please use a persistent id in citations:  

Abstract: As neuroscientists, we are obligated to guarantee the reliability of our research workflows and results. For this reason, most peer-reviewed neuroscience journals require, in addition to a full description of the new scientific findings and the methods used, at least a brief summary of the data used. For fully open scientific inquiry, and to further promote scientific progress as required by most national funding agencies, it would be best to share the raw data, either along with an analysis paper or independently as a standalone data publication. Unfortunately, the neuroscientific community in particular is hesitant to share its own research data with third parties, because guidelines, tools and support that would help publishing authors provide data and adequate accompanying information on the experiment are generally hard to formalize and often missing. As a consequence, published scientific results remain irreproducible for other researchers. Using the example of a complex electrophysiological experiment conducted by external collaboration partners, I demonstrate how to share and publish data, and also identify the reasons why researchers, in particular experimentalists in electrophysiology, are hesitant to do the same. To this end, I first provide a data descriptor of the experiment following the guidelines of Scientific Data, a dedicated data journal from the Nature Publishing Group. In line with the journal guidelines, I describe all information needed to understand the setup and workflow of the experiment, as well as the minimum information needed to work with the corresponding datasets. The latter requires the provision of a robust data loading routine.
To guarantee access to the data of the experiment, I implemented a generally usable loading routine for the data formats of the data acquisition (DAQ) system used, the Cerebus DAQ from Blackrock Microsystems, and published it as part of Neo, an open source data framework. Neo has the advantage of representing data in standardized structures that allow researchers to apply common analysis routines to different data formats. In addition, the Neo data structures can be further annotated with experiment-specific information on the data. To integrate such information, termed metadata, automatically, it is best to have it organized in a machine-readable format. Although several software solutions for such metadata formats exist, they are usually not tested for complex use cases such as the example experiment. In most cases, they only provide the framework itself as a standardized metadata representation or specification, and no solutions for how to actually compile a useful metadata collection. For the example experiment, I chose the metadata framework open metadata Markup Language (odML), an open source project developed by the German Node (G-Node) of the International Neuroinformatics Coordination Facility (INCF). In the second part of the thesis, I demonstrate how to organize the metadata of the experiment in order to compile and use a corresponding odML metadata collection. To ease the compilation process for my collaborators, I developed a Python package, called odMLtables, which simplifies access to the odML framework through an algorithmic transformation of odML into a spreadsheet format (csv or xls) and back. In addition, I provided a complete workflow for collecting the metadata of the experiment and storing it in a comprehensive odML file collection. Furthermore, I provided a specified data loading routine that automatically annotates the data structures with the corresponding metadata of the collection.
The latter improves the workflow of neuroscientific analyses of the data from the example experiment, as demonstrated in the last part of my thesis. In summary, I show that the preparations needed to properly share research data within a scientific collaboration are cumbersome and time-consuming, but essential for successfully publishing data and analysis results for a broader audience of users. To promote data sharing within the neuroscientific community and to provide a better foundation for reproducible research, my thesis offers a coherent strategy for managing electrophysiological data and metadata using a well-selected set of available technologies.


Note: RWTH Aachen, Diss., 2017

Contributing Institute(s):
  1. Computational and Systems Neuroscience (INM-6)
Research Program(s):
  1. 899 - without topic (POF3-899)

Appears in the scientific report 2018
Database coverage:
Creative Commons Attribution CC BY 4.0 ; OpenAccess

The record appears in these collections:
Institute Collections > IAS > IAS-6
Institute Collections > INM > INM-6
Document types > Theses > Ph.D. Theses
Document types > Books > Books
Workflow collections > Public records
JuOSC (Juelich Open Science Collection)
Publications database
Open Access

 Record created 2018-04-25, last modified 2024-03-13